ABSTRACT. This paper gives a brief overview of computation models for data stream processing,
and it introduces a new model for multi-pass processing of multiple streams, the so-called
mp2s-automata. Two algorithms for solving the set disjointness problem with these automata are 
presented. The main technical contribution of this paper is the proof of a lower bound on the size 
of memory and the number of heads that are required for solving the set disjointness problem with 
mp2s-automata. 
1. Introduction

In the basic data stream model, the input consists of a stream of data items which can be read 
only sequentially, one after the other. For processing these data items, a memory buffer of limited 
size is available. When designing data stream algorithms, one aims at algorithms whose memory 
size is far smaller than the size of the input. 

Typical application areas for which data stream processing is relevant are, e.g., IP network 
traffic analysis, mining text message streams, or processing meteorological data generated by sensor 
networks. Data stream algorithms are also used to support query optimization in relational database 
systems. In fact, virtually all query optimization methods in relational database systems rely on 
information about the number of distinct values of an attribute or the self-join size of a relation — 
and these pieces of information have to be maintained while the database is updated. Data stream 
algorithms for accomplishing this task have been introduced in the seminal paper [2].

Most parts of the data stream literature deal with the task of performing one pass over a single 
stream. For a detailed overview of algorithmic techniques for this scenario we refer to [23]. Lower
bounds on the size of memory needed for solving a problem by a one-pass algorithm are usually
obtained by applying methods from communication complexity (see, e.g., [2, 20]). In fact, for many
concrete problems it is known that the memory needed for solving the problem by a deterministic 
one-pass algorithm is at least linear in the size n of the input. For some of these problems, however, 
randomized one-pass algorithms can still compute good approximate answers while using memory 
of size sublinear in n. Typically, such algorithms are based on sampling, i.e., only a "representative"
portion of the data is taken into account, and on random projections, i.e., only a rough "sketch" of
the data is stored in memory. See [23, 10] for a comprehensive survey of the corresponding algorithmic
techniques and for pointers to the literature.

The generalization in which multiple passes over a single stream are performed has also received
considerable attention in the literature. Techniques for proving lower bounds in this scenario
can be found, e.g., in EBUDDSiniim. 

A few articles also deal with the task of processing several streams in parallel. For example, 
the authors of [28] consider algorithms which perform one pass over several streams. They introduce
a new model of multi-party communication complexity that is suitable for proving lower bounds on 
the amount of memory necessary for one-pass algorithms on multiple streams. In [28], these results 
are used for determining the exact space complexity of processing particular XML twig queries. 
In recent years, the database community has also addressed the issue of designing general-purpose 
data stream management systems and query languages that are suitable for new application areas 
where multiple data streams have to be processed in parallel. To get an overview of this research 
area, is a good starting point. Foundations for a theory of stream queries have been laid in 
[19]. Stream-based approaches have also been examined in detail in connection with XML query
processing and validation; see, e.g., the papers [17, 26, 13, 8, 4, 5, 16].

The finite cursor machines (FCMs, for short) of [14] are a computation model for performing
multiple passes over multiple streams. FCMs were introduced as an abstract model of database 
query processing. Formally, they are defined in the framework of abstract state machines. Informally,
they can be described as follows: The input for an FCM is a relational database, each relation
of which is represented by a table, i.e., an ordered list of rows, where each row corresponds to a 
tuple in the relation. Data elements are viewed as "indivisible" objects that can be manipulated by 
a number of "built-in" operations. This feature is very convenient to model standard operations on 
data types like integers, floating point numbers, or strings, which may all be part of the universe of 
data elements. FCMs can operate in a finite number of modes using an internal memory in which 
they can store bitstrings. They access each relation through a finite number of cursors, each of 
which can read one row of a table at any time. The model incorporates certain streaming or sequential
processing aspects by imposing a restriction on the movement of the cursors: They can move
on the tables only sequentially in one direction. Thus, once the last cursor has left a row of a table, 
this row can never be accessed again during the computation. Note, however, that several cursors 
can be moved asynchronously over the same table at the same time, and thus, entries in different, 
possibly far apart, regions of the table can be read and processed simultaneously. 

A common feature of the computation models mentioned so far in this paper is that the input
streams are read-only streams that cannot be modified during a pass. Recently, stream-based
models for external memory processing have also been proposed, among them the StrSort
model [1, 24], the W-Stream model [11], and the model of read/write streams [17, 16, 13, 7, 6].
In these models, several passes may be performed over a single stream or over several streams in 
parallel, and during a pass, the content of the stream may be modified. 

A detailed introduction to algorithms on data streams and to the related area of sublinear
algorithms can be found in [23, 10]. A survey of stream-based models for external memory
processing and of methods for proving lower bounds in these models is given in [25]. A database
systems oriented overview of so-called data stream systems can be found in [3]. For a list of open
problems in the area of data streams we refer to [21].
In the remainder of this article, a new computation model for multi-pass processing of multiple
streams is introduced: the mp2s-automata. In this model, (read-only) streams can be processed by
forward scans as well as backward scans, and several "heads" can be used to perform several passes
over the streams in parallel. After fixing the basic notation in Section 2, the computation model of
mp2s-automata is introduced in Section 3. In Section 4 we consider the set disjointness problem
and prove upper bounds as well as lower bounds on the size of memory and the number of heads
that are necessary for solving this problem with an mp2s-automaton. Section 5 concludes the paper
by pointing out some directions for future research.

2. Basic notation 

If $f$ is a function from the set of non-negative integers to the set of reals, we simply write
$f(n)$ instead of $\lceil f(n) \rceil$ (where $\lceil x \rceil$ denotes the smallest integer $\ge x$). We write $\lg n$ to denote the
logarithm of $n$ with respect to base 2. For a set $D$ we write $D^*$ to denote the set of all finite strings
over alphabet $D$. We view $D^*$ as the set of all finite data streams that can be built from elements in
$D$. For a stream $S \in D^*$ we write $|S|$ to denote the length of $S$, and we write $s_i$ to denote the element
in $D$ that occurs at the $i$-th position in $S$, i.e., $S = s_1 s_2 \cdots s_{|S|}$.

3. A computation model for multi-pass processing of multiple streams 

In this section, we fix a computation model for multi-pass processing of multiple streams. The 
model is quite powerful: Streams can be processed by forward scans as well as backward scans, and 
several "heads" can be used to perform several passes over the stream in parallel. For simplicity, we 
restrict attention to the case where just two streams are processed in parallel. Note, however, that it 
is straightforward to generalize the model to an arbitrary number of streams. 

The computation model, called mp2s-automata,¹ can be described as follows: Let $D$ be a set,
and let $m, k_f, k_b$ be integers with $m \ge 1$ and $k_f, k_b \ge 0$. An

mp2s-automaton $\mathcal{A}$ with parameters $(D, m, k_f, k_b)$

receives as input two streams $S \in D^*$ and $T \in D^*$. The automaton's memory consists of $m$ different
states (note that this corresponds to a memory buffer consisting of $\lg m$ bits). The automaton's state
space is denoted by $Q$. We assume that $Q$ contains a designated start state and that there is a
designated subset $F$ of $Q$ of so-called accepting states.

On each of the input streams $S$ and $T$, the automaton has $k_f$ heads that process the stream
from left to right (so-called forward heads) and $k_b$ heads that process the stream from right to left
(so-called backward heads). The heads are allowed to move asynchronously. We use $k$ to denote
the total number of heads, i.e., $k = 2k_f + 2k_b$.

In the initial configuration of $\mathcal{A}$ on input $(S, T)$, the automaton is in the start state, all forward
heads on $S$ and $T$ are placed on the leftmost element in the stream, i.e., $s_1$ resp. $t_1$, and all backward
heads are placed on the rightmost element in the stream, i.e., $s_{|S|}$ resp. $t_{|T|}$.

During each computation step, depending on (a) the current state (i.e., the current content of the
automaton's memory) and (b) the elements of $S$ and $T$ at the current head positions, a deterministic
transition function determines (1) the next state (i.e., the new content of the automaton's memory)
and (2) which of the $k$ heads should be advanced to the next position (where forward heads are
advanced one step to the right, and backward heads are advanced one step to the left). Formally, the
transition function can be specified in a straightforward way by a function

$\delta : Q \times (D \cup \{\mathrm{end}\})^k \to Q \times \{\mathrm{advance}, \mathrm{stay}\}^k$

where $Q$ denotes the automaton's state space, and end is a special symbol (not belonging to $D$)
which indicates that a head has reached the end of the stream (for a forward head this means that
the head has been advanced beyond the rightmost element of the stream, and for a backward head
this means that the head has been advanced beyond the leftmost element of the stream).

$D$ : set of data items of which input streams $S$ and $T$ are composed

$m$ : size of the automaton's state space $Q$ (this corresponds to $\lg m$ bits of memory)

$k_f$ : number of forward heads available on each input stream

$k_b$ : number of backward heads available on each input stream

$k$ : $2k_f + 2k_b$ (total number of heads)

Figure 1: The meaning of the parameters $(D, m, k_f, k_b)$ of an mp2s-automaton.

¹ "mp2s" stands for multi-pass processing of 2 streams.

The automaton's computation on input (S, T) ends as soon as each head has passed the entire 
stream. The input is accepted if the automaton's state then belongs to the set F of accepting states, 
and it is rejected otherwise. 
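
To make the definition concrete, the following is a minimal simulation sketch, not part of the formal model. The function name `run_mp2s`, the ordering of the heads inside the symbol tuple, the `max_steps` guard and the example transition function `delta_equal` are all assumptions introduced here for illustration. The sketch places the heads as in the initial configuration, applies the transition function step by step, and stops once every head reads `end`.

```python
END = object()  # sentinel playing the role of the special symbol `end`


def run_mp2s(S, T, k_f, k_b, delta, start, accepting, max_steps=10**6):
    """Simulate a deterministic mp2s-automaton on the input (S, T).

    Head order (an assumption of this sketch): the k_f forward heads on S,
    the k_b backward heads on S, then the same for T.
    `delta(state, symbols)` returns `(new_state, moves)`, where `moves` holds
    one boolean per head (True = advance, False = stay).
    """
    per_stream = k_f + k_b
    streams = [S] * per_stream + [T] * per_stream
    forward = ([True] * k_f + [False] * k_b) * 2
    # Forward heads start on the leftmost element, backward heads on the rightmost.
    pos = [0 if fwd else len(s) - 1 for s, fwd in zip(streams, forward)]
    state = start

    for _ in range(max_steps):
        symbols = tuple(
            s[p] if 0 <= p < len(s) else END for s, p in zip(streams, pos)
        )
        if all(x is END for x in symbols):
            # Every head has passed its entire stream: the run ends here.
            return state in accepting
        state, moves = delta(state, symbols)
        for i, advance in enumerate(moves):
            if advance and symbols[i] is not END:
                pos[i] += 1 if forward[i] else -1
    raise RuntimeError("step limit reached before all heads passed the streams")


def delta_equal(state, symbols):
    """Hypothetical 2-state automaton: do S and T agree position by position?"""
    s, t = symbols
    if s != t:
        state = "neq"
    return state, (True, True)
```

For example, `run_mp2s("abc", "abc", 1, 0, delta_equal, "eq", {"eq"})` uses one forward head per stream and the two states `"eq"` and `"neq"`. The step limit exists only because a transition function that never advances any head would otherwise loop forever.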

The computation model of mp2s-automata is closely related to the finite cursor machines of
[14]. In both models, several streams can be processed in parallel, and several heads (or "cursors")
may be used to perform several "asynchronous" passes over the same stream in parallel. In
contrast to the mp2s-automata of the present paper, finite cursor machines were introduced as an 
abstract model for database query processing, and their formal definition in [14] is presented in the 
framework of abstract state machines. 

Note that mp2s-automata can be viewed as a generalization of other models for one-pass or
multi-pass processing of streams. For example, the scenario of [28], where a single pass over two
streams is performed, is captured by an mp2s-automaton where 1 forward head and no backward
heads are available on each stream. Also, the scenario where $p$ consecutive passes of each input
stream are available (cf., e.g., [20]) can be implemented by an mp2s-automaton: just use $p$ forward
heads and no backward heads, and let the $i$-th head wait at the first position of the stream until the
$(i-1)$-th head has reached the end of the stream.

4. The set disjointness problem 

Throughout Section 4 we consider a particular version of the set disjointness problem where,
for each integer $n \ge 1$, $D_n := \{a_1, b_1, \ldots, a_n, b_n\}$ is a fixed set of $2n$ data items. We write
$\mathrm{Disj}_n$ to denote the following decision problem: The input consists of two streams $S$ and $T$ over
$D_n$ with $|S| = |T| = n$. The goal is to decide whether the sets $\{s_1, \ldots, s_n\}$ and $\{t_1, \ldots, t_n\}$ are
disjoint.

An mp2s-automaton solves the problem $\mathrm{Disj}_n$ if, for all valid inputs to $\mathrm{Disj}_n$ (i.e., all $S, T \in D_n^*$
with $|S| = |T| = n$), it accepts the input if, and only if, the corresponding sets are disjoint.
4.1. Two upper bounds for the set disjointness problem

It is straightforward to see that the problem $\mathrm{Disj}_n$ can be solved by an mp2s-automaton with
$2^{2n}$ states and a single forward head on each of the two input streams: During a first phase, the
head on $S$ processes $S$ and stores, in the automaton's current state, the subset of $D_n$ that has been
seen while processing $S$. Afterwards, the head on $T$ processes $T$ and checks whether the element
currently seen by this head belongs to the subset of $D_n$ that is stored in the automaton's state.
Clearly, $2^{2n}$ states suffice for this task, since $|D_n| = 2n$. We thus obtain the following trivial upper
bound:

Proposition 4.1. $\mathrm{Disj}_n$ can be solved by an mp2s-automaton with parameters $(D_n, 2^{2n}, 1, 0)$.
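
The one-head strategy behind this bound can be sketched as follows. This is an illustration only: the function name `disj_one_pass` and the `index` dictionary (a hypothetical numbering of the $2n$ data items by $0, \ldots, 2n-1$) are assumptions of the sketch. The integer `seen` plays the role of the automaton's state and encodes a subset of $D_n$ as a bitmask, so it ranges over at most $2^{2n}$ values.

```python
def disj_one_pass(S, T, index):
    """Decide set disjointness with a single left-to-right pass over S, then T.

    `index` maps every data item of D_n to a distinct bit position.
    """
    seen = 0
    for s in S:  # first phase: the head on S records which items occur
        seen |= 1 << index[s]
    for t in T:  # second phase: the head on T looks each item up in the state
        if (seen >> index[t]) & 1:
            return False  # early exit is a shortcut of this simulation only
    return True
```

With `index = {"a1": 0, "b1": 1, "a2": 2, "b2": 3}`, the call `disj_one_pass(["a1", "b2"], ["b1", "a2"], index)` reports that the two sets are disjoint.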

The following result shows that, at the expense of increasing the number of forward heads on
each stream to $\sqrt{n}$, the memory consumption can be reduced exponentially:

Proposition 4.2. $\mathrm{Disj}_n$ can be solved by an mp2s-automaton with parameters $(D_n, n+2, \sqrt{n}, 0)$.²

Proof. The automaton proceeds in two phases.

The goal in Phase 1 is to move, for each $i \in \{1, \ldots, \sqrt{n}\}$, the $i$-th head on $S$ onto the
$((i-1)\sqrt{n} + 1)$-th position in $S$. This way, after having finished Phase 1, the heads partition
$S$ into $\sqrt{n}$ sub-streams, each of which has length $\sqrt{n}$. Note that $n + 1 - \sqrt{n}$ states suffice for
accomplishing this: The automaton simply stores, in its state, the current position of the rightmost
head(s) on $S$. It starts by leaving head 1 at position 1 and moving the remaining heads on $S$ to the
right until position $\sqrt{n} + 1$ is reached. Then, it leaves head 2 at position $\sqrt{n} + 1$ and proceeds by
moving the remaining heads to the right until position $2\sqrt{n} + 1$ is reached, etc.

During Phase 2, the automaton checks whether the two sets are disjoint. This is done in $\sqrt{n}$
sub-phases. During the $j$-th sub-phase, the $j$-th head on $T$ processes $T$ from left to right and compares
each element in $T$ with the elements on the current positions of the $\sqrt{n}$ heads on $S$. When the $j$-th
head on $T$ has reached the end of the stream, each of the heads on $S$ is moved one step to the right.
This finishes the $j$-th sub-phase. Note that Phase 2 can be accomplished by using just 2 states: By
looking at the combination of heads on $T$ that have already passed the entire stream, the automaton
can tell which sub-phase it is currently performing. Thus, for Phase 2 we just need one state for
indicating that the automaton is in Phase 2, and an additional state for storing that the automaton
has discovered already that the two sets are not disjoint. ■
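
The two phases of this proof can be traced with a short simulation. It is a sketch under simplifying assumptions: $n$ is taken to be a perfect square, positions are 0-based, and the function name `disj_sqrt_heads` is hypothetical. The sketch tracks only head positions and one "not disjoint" flag, mirroring the information the automaton keeps, and does not model the state encoding itself.

```python
import math


def disj_sqrt_heads(S, T):
    """Simulate the sqrt(n)-head strategy of Proposition 4.2 on streams S and T."""
    n = len(S)
    r = math.isqrt(n)
    assert r * r == n and len(T) == n, "sketch assumes |S| = |T| = n, n a perfect square"

    # Phase 1: park the i-th head on S at the start of the i-th sub-stream.
    s_heads = [i * r for i in range(r)]

    # Phase 2: r sub-phases; in each one, a fresh head on T scans all of T.
    disjoint = True
    for _ in range(r):
        for t in T:
            if any(S[p] == t for p in s_heads):
                disjoint = False  # the extra state "sets are not disjoint"
        s_heads = [p + 1 for p in s_heads]  # every head on S moves one step right
    return disjoint
```

After the last sub-phase, the $i$-th head on $S$ has visited exactly the $\sqrt{n}$ positions of its own sub-stream, so every element of $S$ has been compared with every element of $T$.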

4.2. Two lower bounds for the set disjointness problem 

We first show a lower bound for mp2s-automata where only forward heads are available: 

Theorem 4.3. For all integers $n$, $m$, $k_f$ such that, for $k = 2k_f$ and $v = k_f^2 + 1$,

$k^2 \cdot v \cdot \lg(n+1) + k \cdot v \cdot \lg m + v \cdot (1 + \lg v) \le n$,

the problem $\mathrm{Disj}_n$ cannot be solved by any mp2s-automaton with parameters $(D_n, m, k_f, 0)$.

² To be precise, the proof of Proposition 4.2 shows that already $n + 2 - \sqrt{n}$ states suffice.

Proof. Let $n$, $m$, and $k_f$ be chosen such that they meet the theorem's assumption. For contradiction,
let us assume that $\mathcal{A}$ is an mp2s-automaton with parameters $(D_n, m, k_f, 0)$ that solves the problem
$\mathrm{Disj}_n$.

Recall that $D_n = \{a_1, b_1, \ldots, a_n, b_n\}$ is a fixed set of $2n$ data items. Throughout the proof
we will restrict attention to input streams S and T which are enumerations of the elements in a set 

$A^I := \{a_i : i \in I\} \cup \{b_i : i \in \bar{I}\}$

for arbitrary $I \subseteq \{1, \ldots, n\}$ and its complement $\bar{I} := \{1, \ldots, n\} \setminus I$.
Note that for all $I_1, I_2 \subseteq \{1, \ldots, n\}$ we have

$A^{I_1}$ and $A^{I_2}$ are disjoint $\iff$ $I_2 = \bar{I_1}$. (4.1)

For each $I \subseteq \{1, \ldots, n\}$ we let $S^I$ be the stream of length $n$ which is defined as follows: For each
$i \in I$, it carries data item $a_i$ at position $i$; and for each $i \notin I$, it carries data item $b_i$ at position $i$. The
stream $T^I$ contains the same data items as $S^I$, but in the opposite order: For each $i \in I$, it carries
data item $a_i$ at position $n - i + 1$; and for each $i \notin I$, it carries data item $b_i$ at position $n - i + 1$.

For sets $I_1, I_2 \subseteq \{1, \ldots, n\}$, we write $D(I_1, I_2)$ to denote the input instance $S^{I_1}$ and $T^{\bar{I_2}}$ for the
problem $\mathrm{Disj}_n$. From (4.1) and our assumption that the mp2s-automaton $\mathcal{A}$ solves $\mathrm{Disj}_n$, we obtain
that

$\mathcal{A}$ accepts $D(I_1, I_2)$ $\iff$ $I_1 = I_2$. (4.2)
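
The equivalence behind (4.2) can be checked by brute force for small $n$. The sketch below reads $D(I_1, I_2)$ as the pair consisting of $S^{I_1}$ and $T^{\bar{I_2}}$, which is the reading under which $D(I, I)$ is a disjoint instance. Data items are encoded as tuples `("a", i)` and `("b", i)`. The helper names `stream_S`, `stream_T`, `instance` and `subsets` are hypothetical.

```python
from itertools import combinations


def stream_S(I, n):
    """S^I: position i carries a_i if i is in I, and b_i otherwise."""
    return [("a", i) if i in I else ("b", i) for i in range(1, n + 1)]


def stream_T(I, n):
    """T^I: the same items as S^I, with index i placed at position n - i + 1."""
    return list(reversed(stream_S(I, n)))


def instance(I1, I2, n):
    """D(I1, I2): the stream S^{I1} together with T^{complement of I2}."""
    complement = set(range(1, n + 1)) - set(I2)
    return stream_S(I1, n), stream_T(complement, n)


def subsets(n):
    """All subsets of {1, ..., n} as frozensets."""
    items = range(1, n + 1)
    return [frozenset(c) for r in range(n + 1) for c in combinations(items, r)]


def check_4_2(n):
    """True iff D(I1, I2) is a disjoint instance exactly when I1 == I2."""
    for I1 in subsets(n):
        for I2 in subsets(n):
            S, T = instance(I1, I2, n)
            if set(S).isdisjoint(T) != (I1 == I2):
                return False
    return True
```

Calling `check_4_2(4)` iterates over all 256 pairs of subsets of $\{1, \ldots, 4\}$. This confirms only the combinatorial equivalence, not anything about a particular automaton.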
Throughout the remainder of this proof, our goal is to find two sets $I, I' \subseteq \{1, \ldots, n\}$ such that

(1) $I \ne I'$, and

(2) the accepting run of $\mathcal{A}$ on $D(I, I)$ is "similar" to the accepting run of $\mathcal{A}$ on $D(I', I')$, so
that the two runs can be combined into an accepting run of $\mathcal{A}$ on $D(I, I')$ (later on in the
proof, we will see what "similar" precisely means).

Then, however, the fact that $\mathcal{A}$ accepts input $D(I, I')$ contradicts (4.2) and thus finishes the proof
of Theorem 4.3.

For accomplishing this goal, we let

$v := k_f^2 + 1$ (4.3)

be 1 plus the number of pairs of heads on the two streams. We subdivide the set $\{1, \ldots, n\}$ into $v$
consecutive blocks $B_1, \ldots, B_v$ of equal size $\frac{n}{v}$. I.e., for each $j \in \{1, \ldots, v\}$, block $B_j$ consists of
the indices in $\{ (j-1)\frac{n}{v} + 1, \ldots, j\frac{n}{v} \}$.

We say that a pair $(h_S, h_T)$ of heads of $\mathcal{A}$ checks block $B_j$ during the run on input $D(I_1, I_2)$
if, and only if, at some point in time during the run, there exist $i, i' \in B_j$ such that head $h_S$ is on
element $a_i$ or $b_i$ in $S^{I_1}$ and head $h_T$ is on element $a_{i'}$ or $b_{i'}$ in $T^{\bar{I_2}}$.

Note that each pair of heads can check at most one block, since only forward heads are available
and the data items in $T^{\bar{I_2}}$ are arranged in the reverse order (with respect to the indices $i$ of elements
$a_i$ and $b_i$) compared with $S^{I_1}$. Since there are $v$ blocks, but only $v - 1$ pairs $(h_S, h_T)$ of heads on the two
streams, we know that for each $I_1, I_2 \subseteq \{1, \ldots, n\}$ there exists a block $B_j$ that is not checked during
$\mathcal{A}$'s run on $D(I_1, I_2)$.

In the following, we determine a set $X \subseteq \{I : I \subseteq \{1, \ldots, n\}\}$ with $|X| \ge 2$ such that for all
$I, I' \in X$, item (2) of our goal is satisfied. We start by using a simple averaging argument to find a
$j_0 \in \{1, \ldots, v\}$ and a set $X_0 \subseteq \{I : I \subseteq \{1, \ldots, n\}\}$ such that

• for each $I \in X_0$, block $B_{j_0}$ is not checked during $\mathcal{A}$'s run on input $D(I, I)$, and

• $|X_0| \ge \frac{2^n}{v}$.
For the remainder of the proof we fix $B := B_{j_0}$.

We next choose a sufficiently large set $X_1 \subseteq X_0$ in which everything outside block $B$ is fixed:
A simple averaging argument shows that there is an $X_1 \subseteq X_0$ and a set $\tilde{I} \subseteq \{1, \ldots, n\} \setminus B$ such that

• for each $I \in X_1$, $I \setminus B = \tilde{I}$, and

• $|X_1| \ge \frac{|X_0|}{2^{n - n/v}} \ge 2^{\frac{n}{v} - \lg v}$.

We next identify a set $X_2 \subseteq X_1$ such that for all $I, I' \in X_2$ the runs of $\mathcal{A}$ on $D(I, I)$ and $D(I', I')$
are "similar" in a sense suitable for item (2) of our goal. To this end, for each head $h$ of $\mathcal{A}$ we let
$\mathrm{config}_h^I$ be the configuration (i.e., the current state and the absolute positions of all the heads) in the
run of $\mathcal{A}$ on input $D(I, I)$ at the particular point in time where head $h$ has just left block $B$ (i.e.,
head $h$ has just left the last element $a_i$ or $b_i$ with $i \in B$ that it can access). We let $\mathrm{config}^I$ be the
ordered tuple of the configurations $\mathrm{config}_h^I$ for all heads $h$ of $\mathcal{A}$. Note that the number of possible
configurations $\mathrm{config}_h^I$ is $\le m \cdot (n+1)^k$, since $\mathcal{A}$ has $m$ states and since each of the $k = 2k_f$
heads can be at one out of $n+1$ possible positions in its input stream. Consequently, the number of
possible $k$-tuples $\mathrm{config}^I$ of configurations is $\le (m \cdot (n+1)^k)^k$.

A simple averaging argument thus yields a tuple $c$ of configurations and a set $X_2 \subseteq X_1$ such that

• for all $I \in X_2$, $\mathrm{config}^I = c$, and

• $|X_2| \ge \frac{|X_1|}{(m \cdot (n+1)^k)^k} \ge 2^{\frac{n}{v} - \lg v - k \lg m - k^2 \lg(n+1)}$.

Using the theorem's assumption on the numbers $n$, $m$, and $k_f$, one obtains that $|X_2| \ge 2$. Therefore,
we can find two sets $I, I' \in X_2$ with $I \ne I'$.
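
For reference, the three averaging steps and the final use of the theorem's assumption can be collected into one chain. This is a restatement of the bounds above, ignoring rounding, i.e. treating $n/v$ as an integer:

```latex
\begin{align*}
|X_0| &\ge \frac{2^{n}}{v} = 2^{\,n - \lg v}, \\
|X_1| &\ge \frac{|X_0|}{2^{\,n - n/v}} \ge 2^{\,n/v - \lg v}, \\
|X_2| &\ge \frac{|X_1|}{\bigl(m \cdot (n+1)^{k}\bigr)^{k}}
       \ge 2^{\,n/v - \lg v - k \lg m - k^{2} \lg(n+1)} .
\end{align*}
% Dividing the assumption
%   k^2 v \lg(n+1) + k v \lg m + v (1 + \lg v) \le n
% by v gives
\[
  k^{2} \lg(n+1) + k \lg m + 1 + \lg v \;\le\; \frac{n}{v}
  \quad\Longrightarrow\quad
  \frac{n}{v} - \lg v - k \lg m - k^{2} \lg(n+1) \;\ge\; 1 ,
\]
% hence |X_2| \ge 2^{1} = 2.
```

The factor $2^{n - n/v}$ in the second line is the number of subsets of $\{1, \ldots, n\} \setminus B$, and the denominator in the third line is the number of possible tuples of configurations.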

To finish the proof of Theorem 4.3, it remains to show that the runs of $\mathcal{A}$ on $D(I, I)$ and on
$D(I', I')$ can be combined into a run of $\mathcal{A}$ on $D(I, I')$ such that $\mathcal{A}$ (falsely) accepts input $D(I, I')$.
To this end let us summarize what we know about $I$ and $I'$ in $X_2$:

(a) $I$ and $I'$ only differ in block $B$.

(b) Block $B$ is not checked during $\mathcal{A}$'s runs on $D(I, I)$ and on $D(I', I')$. I.e., while any head on
$S^I$ (resp. $S^{I'}$) is at an element $a_i$ or $b_i$ with $i \in B$, no head on $T^{\bar{I}}$ (resp. $T^{\bar{I'}}$) is on an element
$a_{i'}$ or $b_{i'}$ with $i' \in B$.

(c) Considering $\mathcal{A}$'s runs on $D(I, I)$ and on $D(I', I')$, each time a head leaves the last position
in $B$ that it can access, both runs are in exactly the same configuration. I.e., they are in
the same state, and all heads are at the same absolute positions in their input streams.

Due to item (a), $\mathcal{A}$'s run on input $D(I, I')$ starts in the same way as the runs on $D(I, I)$ and
$D(I', I')$: As long as no head has reached an element in block $B$, the automaton has not yet seen
any difference between $D(I, I')$ on the one hand and $D(I, I)$ and $D(I', I')$ on the other hand.

At some point in time, however, some head $h$ will enter block $B$, i.e., it will enter the first
element $a_i$ or $b_i$ with $i \in B$ that it can access. The situation then is as follows:

• If $h$ is a head on $S^I$, then, due to item (b), no head on $T^{\bar{I'}}$ is at an element in $B$. Therefore,
until head $h$ leaves block $B$, $\mathcal{A}$ will go through the same sequence of configurations as in
its run on input $D(I, I)$. Item (c) ensures that when $h$ leaves block $B$, $\mathcal{A}$ is in the same
configuration as in its runs on $D(I, I)$ and on $D(I', I')$.

• Similarly, if $h$ is a head on $T^{\bar{I'}}$, then, due to item (b), no head on $S^I$ is at an element
in $B$. Therefore, until head $h$ leaves block $B$, $\mathcal{A}$ will go through the same sequence of
configurations as in its run on input $D(I', I')$. Item (c) ensures that when $h$ leaves block $B$,
$\mathcal{A}$ is in the same configuration as in its runs on $D(I', I')$ and on $D(I, I)$.

In summary, in $\mathcal{A}$'s run on $D(I, I')$, each time a head $h$ has just left the last element in block $B$ that
it can access, it is in exactly the same configuration as in $\mathcal{A}$'s runs on $D(I, I)$ and on $D(I', I')$ at
the points in time where head $h$ has just left the last element in block $B$ that it can access. After
the last head has left block $B$, $\mathcal{A}$'s run on $D(I, I')$ finishes in exactly the same way as $\mathcal{A}$'s runs
on $D(I, I)$ and $D(I', I')$. In particular, it accepts $D(I, I')$ (since it accepts $D(I, I)$ and $D(I', I')$).
This, however, is a contradiction to (4.2). Thus, the proof of Theorem 4.3 is complete. ■
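
The hypothesis of Theorem 4.3 is easy to evaluate numerically for concrete parameters. The helper below is a small sketch with a hypothetical name, `theorem_4_3_applies`. It only tests the inequality in the theorem statement, using floating-point logarithms, so it is indicative rather than exact near the boundary.

```python
import math


def theorem_4_3_applies(n, m, k_f):
    """True iff k^2*v*lg(n+1) + k*v*lg(m) + v*(1 + lg v) <= n for k = 2*k_f, v = k_f^2 + 1."""
    k = 2 * k_f
    v = k_f ** 2 + 1
    lhs = (
        k ** 2 * v * math.log2(n + 1)
        + k * v * math.log2(m)
        + v * (1 + math.log2(v))
    )
    return lhs <= n
```

For instance, with $n = 10^6$, $m = 2^{10}$ and $k_f = 2$ the left-hand side is roughly 1811, so the lower bound applies. With $n = 100$, $m = 2$ and $k_f = 2$ the left-hand side already exceeds 500 and the theorem says nothing.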

Remark 4.4. Let us compare the lower bound from Theorem 4.3 with the upper bound of
Proposition 4.2. The upper bound tells us that $\mathrm{Disj}_n$ can be solved by an mp2s-automaton with $n+2$ states
and $\sqrt{n}$ forward heads on each input stream. The lower bound implies (for large enough $n$) that
if just ^fn forward heads are available on each stream, not even 2 v/ ™ states suffice for solving the 
problem Disj n with an mp2s-automaton. 

Remark 4.5. A straightforward calculation shows that the assumptions of Theorem 4.3 are satisfied,
for example, for all sufficiently large integers n and all integers m and k f with ikf ^ \Jrin an< ^ 

Theorem 4.3 can be generalized to the following lower bound for mp2s-automata where also
backward heads are available:

Theorem 4.6. For all $n$, $m$, $k_f$, $k_b$ such that, for $k = 2k_f + 2k_b$ and $v = (k_f^2 + k_b^2 + 1) \cdot (2k_f k_b + 1)$,

$k^2 \cdot v \cdot \lg(n+1) + k \cdot v \cdot \lg m + v \cdot (1 + \lg v) \le n$,

the problem $\mathrm{Disj}_n$ cannot be solved by any mp2s-automaton with parameters $(D_n, m, k_f, k_b)$.

Proof. The overall structure of the proof is the same as in the proof of Theorem 4.3. We consider
the same sets $A^I$, for all $I \subseteq \{1, \ldots, n\}$. The stream $S^I$ is chosen in the same way as in the proof of
Theorem 4.3, i.e., for each $i \in I$, the stream $S^I$ carries data item $a_i$ at position $i$; and for each $i \notin I$,
it carries data item $b_i$ at position $i$.

Similarly as in the proof of Theorem 4.3, the stream $T^I$ contains the same data items as $S^I$.
Now, however, the order in which the elements occur in $T^I$ is a bit more elaborate. For fixing this
order, we choose the following parameters:

$v_1 := k_f^2 + k_b^2 + 1$, $v_2 := 2k_f k_b + 1$, $v := v_1 \cdot v_2$. (4.4)

We subdivide the set $\{1, \ldots, n\}$ into $v_1$ consecutive blocks $B_1, \ldots, B_{v_1}$ of equal size $\frac{n}{v_1}$. I.e., for each
$j \in \{1, \ldots, v_1\}$, block $B_j$ consists of the indices in $\{ (j-1)\frac{n}{v_1} + 1, \ldots, j\frac{n}{v_1} \}$.

Afterwards, we further subdivide each block $B_j$ into $v_2$ consecutive subblocks of equal size $\frac{n}{v}$.
These subblocks are denoted $B_j^1, \ldots, B_j^{v_2}$. Thus, each subblock $B_j^{j'}$ consists of the indices in

$\{ (j-1)\frac{n}{v_1} + (j'-1)\frac{n}{v} + 1, \ldots, (j-1)\frac{n}{v_1} + j'\frac{n}{v} \}$.

Now let $\pi$ be the permutation of $\{1, \ldots, n\}$ which maps, for all $j, s$ with $1 \le j \le v_1$ and
$1 \le s \le \frac{n}{v_1}$, element $(j-1)\frac{n}{v_1} + s$ onto element $(v_1 - j)\frac{n}{v_1} + s$. Thus, it maps elements in block
$B_j$ onto elements in block $B_{v_1 - j + 1}$, and inside these two blocks, $\pi$ maps the elements of subblock
$B_j^{j'}$ onto elements in subblock $B_{v_1 - j + 1}^{j'}$. Note that $\pi$ reverses the blocks $B_j$ in order, but it does not
reverse the order of the subblocks $B_j^{j'}$.

Finally, we are ready to fix the order in which the elements in $A^I$ occur in the stream $T^I$: For
each $i \in I$, the stream $T^I$ carries data item $a_i$ at position $\pi(i)$; and for each $i \notin I$, it carries data
item $b_i$ at position $\pi(i)$.

In the same way as in the proof of Theorem 4.3, we write $D(I_1, I_2)$ to denote the input instance
$S^{I_1}$ and $T^{\bar{I_2}}$.
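
The permutation used in this proof is easy to tabulate for small parameters. The sketch below is an illustration with hypothetical helper names (`make_pi`, `stream_T_pi`). It assumes that $n$ is divisible by $v_1$ and encodes data items as tuples `("a", i)` and `("b", i)`. It builds the permutation from the rule that offset $s$ in block $B_j$ goes to offset $s$ in block $B_{v_1 - j + 1}$, and then places the items of $T^I$ accordingly.

```python
def make_pi(n, v1):
    """Permutation of {1..n}: (j-1)*n/v1 + s  ->  (v1-j)*n/v1 + s."""
    assert n % v1 == 0, "sketch assumes that v1 divides n"
    size = n // v1
    pi = {}
    for j in range(1, v1 + 1):
        for s in range(1, size + 1):
            pi[(j - 1) * size + s] = (v1 - j) * size + s
    return pi


def stream_T_pi(I, n, pi):
    """T^I: item a_i (i in I) or b_i (i not in I) is placed at position pi(i)."""
    T = [None] * n
    for i in range(1, n + 1):
        T[pi[i] - 1] = ("a", i) if i in I else ("b", i)
    return T
```

With $n = 12$ and $v_1 = 3$, the blocks $\{1..4\}$, $\{5..8\}$, $\{9..12\}$ appear in $T^I$ in the order third, second, first, while the indices inside each block keep their relative order. This is the property that lets mixed pairs of heads be handled on the level of subblocks.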

A pair of heads $(h_S, h_T)$ is called mixed if one of the heads is a forward head and the other is
a backward head. Since $\pi$ reverses the order of the blocks $B_1, \ldots, B_{v_1}$, it is straightforward to see
that every non-mixed pair of heads can check at most one of the blocks $B_1, \ldots, B_{v_1}$. Since there
are $v_1$ blocks, but only $(v_1 - 1)$ non-mixed pairs of heads, we know that for all $I_1, I_2 \subseteq \{1, \ldots, n\}$
there exists a block $B_j$ that is not checked by any non-mixed pair of heads during $\mathcal{A}$'s run on input
$D(I_1, I_2)$.

The same averaging argument as in the proof of Theorem 4.3 thus tells us that there is a $j_1 \in
\{1, \ldots, v_1\}$ and a set $X' \subseteq \{I : I \subseteq \{1, \ldots, n\}\}$ such that

• for each $I \in X'$, block $B_{j_1}$ is not checked by any non-mixed pair of heads during $\mathcal{A}$'s run
on input $D(I, I)$, and

• $|X'| \ge \frac{2^n}{v_1}$.

From our particular choice of $\pi$, it is straightforward to see that every mixed pair of heads can check
at most one of the subblocks $B_{j_1}^1, \ldots, B_{j_1}^{v_2}$. Since there are $v_2$ such subblocks, but only $(v_2 - 1)$
mixed pairs of heads, there must be a $j_2 \in \{1, \ldots, v_2\}$ and a set $X_0 \subseteq X'$ such that

• for each $I \in X_0$, subblock $B_{j_1}^{j_2}$ is not checked by any pair of heads during $\mathcal{A}$'s run on input
$D(I, I)$, and

• $|X_0| \ge \frac{|X'|}{v_2} \ge \frac{2^n}{v}$.

For the remainder of the proof we fix $B := B_{j_1}^{j_2}$, and we let $k := 2k_f + 2k_b$ denote the total
number of heads. Using this notation, the rest of the proof can be taken verbatim from the proof
of Theorem 4.3. ■

The proof of Theorem 4.6 is implicit in [14] (see Theorem 5.11 in [14]). There, however, the
proof is formulated in the terminology of a different machine model, the so-called finite cursor
machines.



5. Final remarks 

Several questions concerning the computational power of mp2s-automata occur naturally. On 
a technical level, it would be nice to determine the exact complexity of the set disjointness problem 
with respect to mp2s-automata. In particular: Is the upper bound provided by Proposition 4.2
optimal? Can backward scans significantly help for solving the set disjointness problem? Are $\sqrt{n}$
heads really necessary for solving the set disjointness problem when only a sub-exponential number 
of states are available? 

A more important task, however, is to consider also randomized versions of mp2s-automata, 
to design efficient randomized approximation algorithms for particular problems, and to develop 
techniques for proving lower bounds in the randomized model. 